

Search for: All records

Creators/Authors contains: "Carey, Michael"


  1. Abstract: Permafrost thaw alters groundwater flow, river hydrology, stream‐catchment interactions, and the availability of carbon and nutrients in headwater streams. The impact of permafrost on watershed hydrology and biogeochemistry of headwater streams has been demonstrated, but there is little understanding of how permafrost influences fish in these ecosystems. We examined relations among permafrost characteristics, the resulting changes in water temperature, stream hydrology (e.g., discharge flashiness), and macroinvertebrates, with the abundance, biomass, and energy density of juvenile Dolly Varden (Salvelinus malma) and Arctic Grayling (Thymallus arcticus) across 10 headwater streams in northwestern Alaska. Macroinvertebrate density was driven by concentrations of dissolved carbon and nutrients supporting stream food webs. Dolly Varden abundance was primarily related to water temperature with fewer fish in warmer streams, whereas Dolly Varden energy density decreased with the flashiness of the headwater streams. Dolly Varden biomass was related to both temperature and bottom‐up food web effects. The energy density of Arctic Grayling decreased with warmer temperatures and discharge flashiness. These relations demonstrate the importance of terrestrial–aquatic connections in permafrost landscapes and indicate the complexity of landscape effects on fish. Because permafrost thaw is one of the most impactful changes occurring as the Arctic warms, an improved understanding of how stream temperature, hydrology, and bottom‐up food web processes influence fish populations can aid forecasting of future conditions across the Arctic. 
    Free, publicly-accessible full text available May 1, 2026
  2. Efficient multi-join query processing is crucial but remains a complex, ongoing challenge for high-performance data management systems (DBMSs). This paper studies the impact of different memory distribution techniques among join operators on different classes of multi-join query plans under different assumptions regarding memory availability and storage devices such as HDD and SSD on Amazon Web Services (AWS). We re-evaluate the results of one of the early impactful studies from the 1990s that was originally done using a simulator for the Gamma database system. The main goal of our study is to scientifically re-evaluate and build upon previous studies whose results have become the basis for the design of past and modern database systems, and to provide a solid foundation for understanding basic "join physics", which is essential for eventually designing a resource-based scheduler for concurrent complex workloads. 
    Free, publicly-accessible full text available November 20, 2025
  3. The increasing prevalence of large graph data has produced a variety of research and applications tailored toward graph data management. Users aiming to perform graph analytics will typically start by importing existing data into a separate graph-purposed storage engine. The cost of maintaining a separate system (e.g., the data copy, the associated queries, etc.) just for graph analytics may be prohibitive for users with Big Data. In this paper, we introduce Graphix and show how it enables property graph views of existing document data in AsterixDB, a Big Data management system boasting a partitioned-parallel query execution engine. We explain a) the graph view user model of Graphix, b) gSQL++, a novel query language extension for synergistic document-based navigational pattern matching, and c) how edge hops are evaluated in a parallel fashion. We then compare queries authored in gSQL++ against versions in other leading query languages. Finally, we evaluate our approach against a leading native graph database, Neo4j, and show that Graphix is appropriate for operational and analytical workloads, especially at scale. 
  4. Abstract: Climate change in the Arctic is altering watershed hydrologic processes and biogeochemistry. Here, we present an emergent threat to Arctic watersheds based on observations from 75 streams in Alaska’s Brooks Range that recently turned orange, reflecting increased loading of iron and toxic metals. Using remote sensing, we constrain the timing of stream discoloration to the last 10 years, a period of rapid warming and snowfall, suggesting impairment is likely due to permafrost thaw. Thawing permafrost can foster chemical weathering of minerals, microbial reduction of soil iron, and groundwater transport of metals to streams. Compared to clear reference streams, orange streams have lower pH, higher turbidity, and higher sulfate, iron, and trace metal concentrations, supporting sulfide mineral weathering as a primary mobilization process. Stream discoloration was associated with dramatic declines in macroinvertebrate diversity and fish abundance. These findings have considerable implications for drinking water supplies and subsistence fisheries in rural Alaska. 
    Free, publicly-accessible full text available December 1, 2025
  5. Join operations are crucial in data analysis, but can suffer inefficiency with large datasets and complex non-equality-based conditions. Optimized join algorithms have gained traction in database research to address these challenges. One popular choice for implementing join algorithms is distributed data processing frameworks, e.g., Hadoop and Spark, but each implementation is highly tailored for specific query types. As a result, they do not address join queries that involve diverse and complex conditions since they are not integrated into a holistic query optimization engine like in DBMSs. On the other hand, implementing new join algorithms on a DBMS from scratch requires substantial effort and expertise. This paper introduces FUDJ, Flexible User-defined Distributed Joins, a framework for complex distributed join algorithms. The key idea of FUDJ is to allow developers to realize new distributed join algorithms into the database without delving into the database internals. As shown, an algorithm implemented in FUDJ is up to an order of magnitude faster than existing user-defined implementations with an order of magnitude fewer lines of code. 
  6. Join operations are crucial in data analysis, but can suffer inefficiency with large datasets and complex non-equality-based conditions. Optimized join algorithms have gained traction in database research to address these challenges. One popular choice for implementing join algorithms is distributed data processing frameworks, e.g., Hadoop and Spark, but each implementation is highly tailored for specific query types. As a result, they do not address join queries that involve diverse and complex conditions since they are not integrated into a holistic query optimization engine like in DBMSs. On the other hand, implementing new join algorithms on a DBMS from scratch requires substantial effort and expertise. This paper introduces FUDJ, Flexible User-defined Distributed Joins, a framework for complex distributed join algorithms. The key idea of FUDJ is to allow developers to realize new distributed join algorithms into the database without delving into the database internals. As shown, an algorithm implemented in FUDJ is up to an order of magnitude faster than existing user-defined implementations with an order of magnitude fewer lines of code. 
  7. SQL is five decades old and has outlasted many programming and query languages that have come and gone during its lifetime. It was born shortly after the introduction of the relational model, and was designed for querying a flat and typed tabular world. Support for modern, flexible data in the SQL standard and in relational database systems has largely been approached via the addition of new column types (e.g., XML or JSON) together with functions to operate on them. It is time for a cleaner solution that retains the benefits that have allowed SQL to be so successful for so long. We describe SQL++, a SQL extension that relaxes SQL's strictness in terms of both object structure (flat → nested) and schema (mandatory → optional), along with a multi-party effort to agree on a core definition and syntax supportable by multiple vendors. SQL++ sees relational data as a subset of a more flexible object model and it sees collections of document data (e.g., JSON) as a natural and supportable relaxation as opposed to a “bolt on” addition via a SQL column type. We describe the core features of SQL++ and explain how its definition can accommodate flexible data, while staying true to SQL in situations where the target data is tabular and strongly typed. Index Terms: semistructured data, query, JSON, SQL, NoSQL 
  8. Abstract: Window queries are important analytical tools for ordered data and have been researched both in streaming and stored data environments. By incorporating ideas for window queries from existing streaming and stored data systems, we propose a new window syntax that makes a wide range of window queries easier to write and optimize. We have implemented this new window syntax in SQL++, an SQL extension that supports querying semistructured data, on top of AsterixDB, a Big Data Management System, thus allowing us to process window queries over large datasets in a parallel and efficient manner. 
  9. Effective query optimization remains an open problem for Big Data Management Systems. In this work, we revisit an old idea, runtime dynamic optimization, and adapt it to a big data management system, AsterixDB. The approach runs in stages (re-optimization points), starting by first executing all predicates local to a single dataset. The intermediate result created by a stage is then used to re-optimize the remaining query. This re-optimization approach avoids inaccurate intermediate result cardinality estimates, thus leading to much better execution plans. While it introduces overhead for materializing intermediate results, experiments show that this overhead is relatively small and is an acceptable price to pay given the optimization benefits. 
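
The staged approach summarized in item 9 (execute local predicates first, materialize the intermediate result, then re-optimize what remains) can be illustrated with a toy in-memory sketch. This is not AsterixDB's implementation: the names `hash_join` and `run_staged`, the cost heuristic (sum of observed input sizes), and the assumptions that column names are unique across datasets and that the joins form a connected, acyclic graph are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Table = List[Row]
Join = Tuple[str, str, str, str]  # (left dataset, right dataset, left key, right key)


def hash_join(left: Table, right: Table, left_key: str, right_key: str) -> Table:
    """Equi-join two materialized tables, building the hash table on the smaller input."""
    build, probe, bkey, pkey = left, right, left_key, right_key
    if len(right) < len(left):
        build, probe, bkey, pkey = right, left, right_key, left_key
    index: Dict[object, List[Row]] = {}
    for row in build:
        index.setdefault(row[bkey], []).append(row)
    # Assumes column names are unique across datasets, so merging dicts loses nothing.
    return [{**b, **p} for p in probe for b in index.get(p[pkey], [])]


def run_staged(
    tables: Dict[str, Table],
    local_preds: Dict[str, Callable[[Row], bool]],
    joins: List[Join],
) -> Tuple[Table, List[Tuple[str, str]]]:
    """Run a multi-join query in stages, re-planning after each materialized result."""
    # First stage: apply every predicate that is local to a single dataset.
    current: Dict[str, Table] = {}
    for name, rows in tables.items():
        pred = local_preds.get(name)
        current[name] = [r for r in rows if pred is None or pred(r)]

    # group[name] is the key in `current` that now holds that dataset's rows.
    group = {name: name for name in tables}
    remaining = list(joins)
    executed: List[Tuple[str, str]] = []

    while remaining:
        # Re-optimization point: rank the remaining joins by the actual sizes
        # of the materialized inputs rather than by estimated cardinalities.
        def observed_cost(j: Join) -> int:
            return len(current[group[j[0]]]) + len(current[group[j[1]]])

        nxt = min(remaining, key=observed_cost)
        remaining.remove(nxt)
        a, b, ka, kb = nxt
        ga, gb = group[a], group[b]

        # Execute the chosen join and materialize its result for the next stage.
        current[ga] = hash_join(current[ga], current[gb], ka, kb)
        del current[gb]
        for name, g in group.items():
            if g == gb:
                group[name] = ga
        executed.append((a, b))

    (final,) = current.values()  # assumes the join graph connected all datasets
    return final, executed
```

Calling `run_staged` with a dictionary of datasets, a dictionary of per-dataset filter functions, and a list of join tuples returns the final rows together with the order in which the joins were actually executed, which makes the effect of each re-optimization point visible. Materializing every intermediate result is the overhead the abstract refers to.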